Conversation
…he previous value associated with key
2bf8d81 to
e40e293
Compare
|
| mergeValuesInto(merged, action.expressionValues()); | ||
| } | ||
|
|
||
| return merged.isEmpty() ? Collections.emptyMap() : Collections.unmodifiableMap(merged); |
There was a problem hiding this comment.
Is this necessary? Why not just returned Collections.unmodifiableMap(merged) unconditionally?
There was a problem hiding this comment.
It's not necessary but we can avoid wrapping an empty map with UnmodifiableMap this way.
| String.format("Attempt to coalesce two expressions with conflicting expression names. " | ||
| + "Expression name key = '%s'", key)); |
There was a problem hiding this comment.
No, but I can see why you'd think that. It's a bit hard to read the diff.
In the current code streamOfExpressionNames calls joinNames() which has the same validation.
In the new code I just moved the validation here.
We also have an existing test covering this specific codepath
|
This pull request has been closed and the conversation has been locked. Comments on closed PRs are hard for our team to see. If you need more assistance, please open a new issue that references this one. |



Overview
This PR targets two bottlenecks in
UpdateExpressionConverter.toExpressionas part of the Update operation request pipeline.toExpressionscales and becomes the most noticable bottleneck the bigger the input is: ~40% cpu time in SMALL datasets, and ~74% in HUGE_FLAT datasets.Interactive Flamegraph
Testing
We have fairly good test coverage already:
convert_mixedActions_uniqueNameTokensandconvert_mixedActionsverify the string output for correctness. This PR changes the underlying implementation but the API surface remains unchanged.1. String.format overhead (~18% CPU time)
groupExpressionsuses String.format to build the update expression for each attribute. Can replace with a StringJoiner which avoids the formatter's overhead.2. Nested
Stream.concat()inmergeExpressionNames()andmergeExpressionValues()(15-50% CPU time)Current implementation:
The problem is that
Stream.reduce()operation callsExpression.joinNames()for each action type (SET, ADD, REMOVE, DELETE). This method creates a new HashMap and copies all existing entries on every iteration:Example: With N attributes distributed across 4 action types, Stream.reduce() creates 3 intermediate HashMaps. For example, with 20 attributes (5 per action type), it copies 5 entries, then 10, then 15, before producing the final 20 entry result. The cost scales poorly with dataset size.
Instead we can iterate directly on each action and use a single HashMap:
This eliminates intermediate HashMap allocations and theredundant entry copying. The same proposed optimization also applies to
mergeExpressionValues().Results:
toExpressionwent from ~40% CPU time to ~15%groupExpressionswent from 18% CPU time to ~2.5%mergeExpressionValuesandmergeExpressionNameswent down from 18% to ~4%